A new parallel recomputing code design methodology for fast failure recovery

Authors

  • Yunfei Du
  • Yuhua Tang
  • Xinwei Xie
Abstract

As large-scale computer systems grow, their mean time between failures is becoming significantly shorter than the execution time of many current scientific applications. The fault-tolerant parallel algorithm (FTPA) is an application-level fault-tolerance approach that achieves fast self-recovery through parallel recomputing. In our prior work, the parallel recomputing code for FTPA was designed by parallelizing loops. In the present paper, we first propose a new methodology for designing parallel recomputing code. Second, we automate this methodology using compiler technology. Finally, we evaluate the performance of our approach with five programs on Tianhe-1A. The experimental results show that the parallel recomputing code generated by the new method achieves higher parallel recomputing efficiency. © 2013 Elsevier Ltd. All rights reserved.

Similar resources

Mapping of McGraw Cycle to RUP Methodology for Secure Software Developing

Designing a secure software is one of the major phases in developing a robust software. The McGraw life cycle, as one of the well-known software security development approaches, implements different touch points as a collection of software security practices. Each touch point includes explicit instructions for applying security in terms of design, coding, measurement, and maintenance of softwar...

A Fast Strategy to Find Solution for Survivable Multicommodity Network

This paper proposes an immediately efficient method, based on Benders Decomposition (BD), for solving the survivable capacitated network design problem. This problem involves selecting a set of arcs for building a survivable network at a minimum cost and within a satisfied flow. The system is subject to failure and capacity restriction. To solve this problem, the BD was initially proposed with ...

Optimal Design of Heterogeneous Series-Parallel Systems with Common-Cause Failures

The rapid advancements in science and technology aggravate the need for the optimal design of modern systems, aiming to achieve the maximum system reliability while meeting some resource constraints (e.g., cost, weight, etc.). In this paper, we consider the system reliability optimization for series-parallel systems subject to common-cause failures (CCF). Unlike traditional approaches t...

Estimation of Plunge Value in Single- or Multi-Layered Anisotropic Media Using Analysis of Fast Polarization Direction of Shear Waves

Estimation of the fast polarization direction of shear seismic waves that deviate from horizontal axis is a valuable approach to investigate the characteristics of the lower crust and uppermost mantle structures. The lattice preferred orientation of crystals, which is generally parallel to the downward or upward flow of the mantle or crust, is an important reason for the occurrence of fast axis...

Environment as a Pattern for Design. Case study: Shandiz valley in Mashhad - Iran

The aim of this paper is to review the problems created by, and the failure of, environmental design patterns in the design process. Failing to respect the environmental context during the design process has resulted in a loss of environmental quality and damage to both the nature and the essence of the environment, as manifested in the case study area. Methodology of the research was based on the...

Journal:
  • Computers & Electrical Engineering

Volume 39  Issue

Pages  -

Publication date 2013